CLAIMS 



1. An image recognition apparatus which compares an object image containing a
plurality of objects with a model image containing a model to be detected and extracts 
the model from the object image, the apparatus comprising: 

feature point extracting means for extracting a feature point from each of the 
object image and the model image; 

feature quantity retention means for extracting and retaining, as a feature 
quantity, a density gradient direction histogram at least acquired from density gradient 
information in a neighboring region at the feature point in each of the object image 
and the model image; 

feature quantity comparison means for comparing each feature point of the 
object image with each feature point of the model image and generating a 
candidate-associated feature point pair having similar feature quantities; and 

model attitude estimation means for detecting the presence or absence of the 
model on the object image using the candidate-associated feature point pair and 
estimating a position and an attitude of the model, if any, 

wherein the feature quantity comparison means itinerantly shifts one of the 
density gradient direction histograms of feature points to be compared in density 
gradient direction to find distances between the density gradient direction histograms 
and generates the candidate-associated feature point pair by assuming a shortest
distance to be a distance between the density gradient direction histograms.
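
The comparison recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes that "itinerantly shifts" means a cyclic shift of the histogram bins and that the distance is Euclidean. The claims name no metric, and the function name `min_cyclic_distance` is hypothetical.

```python
import math


def min_cyclic_distance(hist_a, hist_b):
    """Shortest distance between hist_a and every cyclic shift of hist_b.

    Returns (distance, shift). Euclidean distance is an assumption; the
    claims do not name a metric.
    """
    if len(hist_a) != len(hist_b) or not hist_a:
        raise ValueError("histograms must be non-empty and of equal length")
    best_distance, best_shift = math.inf, 0
    for shift in range(len(hist_b)):
        # Rotate the bins of hist_b left by `shift` positions.
        shifted = list(hist_b[shift:]) + list(hist_b[:shift])
        distance = math.dist(hist_a, shifted)
        if distance < best_distance:
            best_distance, best_shift = distance, shift
    return best_distance, best_shift
```

Under these assumptions, a histogram with n direction bins and a best shift of k bins corresponds to a rotation angle of k × 360/n degrees, which is the "rotation angle equivalent to a shift amount" referred to in the dependent claims.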

2. The image recognition apparatus according to claim 1,

wherein the feature quantity retention means extracts and retains, as the 
feature quantity, an average density gradient vector for each of a plurality of partial
regions into which the neighboring region is further divided, and 

the feature quantity comparison means generates the candidate-associated 
feature point pair based on a distance between density gradient direction histograms 
for the feature points to be compared and on similarity between feature vectors which 
are collected in the neighboring region as average density gradient vectors in each of 
the partial regions. 

3. The image recognition apparatus according to claim 2, wherein the feature 
quantity comparison means generates a provisional candidate-associated feature point 
pair based on a distance between the density gradient direction histograms for the 
feature points to be compared and, based on the similarity between feature vectors, 
selects the candidate-associated feature point pair from the provisional 
candidate-associated feature point pair. 

4. The image recognition apparatus according to claim 3, wherein the feature 
quantity comparison means uses a rotation angle equivalent to a shift amount giving 
the shortest distance to correct a density gradient direction of a density gradient vector 
in the neighboring region and selects the candidate-associated feature point pair from 
the provisional candidate-associated feature point pair based on similarity between the
feature vectors in a corrected neighboring region.

5. The image recognition apparatus according to claim 1, wherein the model
attitude estimation means repeatedly projects an affine transformation parameter 
determined from three randomly selected candidate-associated feature point pairs onto 
a parameter space and finds an affine transformation parameter to determine a position 
and an attitude of the model based on an affine transformation parameter belonging to 
a cluster having the largest number of members out of clusters formed on a parameter 
space. 
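
The parameter-space procedure of claim 5 can be sketched as follows. This is a hedged illustration only: the claims do not specify a clustering method, so the incremental distance-threshold clustering, the `radius` and `trials` values, and all function names here are assumptions. The affine map is written as u = a1·x + a2·y + b1, v = a3·x + a4·y + b2.

```python
import math
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def affine_from_three(pairs):
    """Solve (a1, a2, b1, a3, a4, b2) from three ((x, y), (u, v)) pairs.

    Returns None when the three source points are collinear.
    """
    m = [[x, y, 1.0] for (x, y), _ in pairs]
    d = det3(m)
    if abs(d) < 1e-9:
        return None
    params = []
    for k in (0, 1):  # k = 0 solves the u row, k = 1 the v row
        rhs = [dst[k] for _, dst in pairs]
        for col in range(3):
            mc = [row[:] for row in m]
            for i in range(3):
                mc[i][col] = rhs[i]
            params.append(det3(mc) / d)  # Cramer's rule
    return tuple(params)


def largest_cluster(pairs, trials=200, radius=0.5, seed=0):
    """Project parameters from random triples and return the biggest cluster.

    The threshold-based clustering is an assumed stand-in; any clustering
    over the six-dimensional parameter space could be substituted.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(trials):
        p = affine_from_three(rng.sample(pairs, 3))
        if p is None:
            continue
        for c in clusters:
            if math.dist(p, c["centroid"]) < radius:
                c["members"].append(p)
                n = len(c["members"])
                c["centroid"] = [sum(v) / n for v in zip(*c["members"])]
                break
        else:
            clusters.append({"centroid": list(p), "members": [p]})
    return max(clusters, key=lambda c: len(c["members"]), default=None)
```

Triples that contain a mismatched pair tend to scatter across the parameter space, while triples made only of correct pairs land on nearly the same point, which is why the most populated cluster is taken as the estimate.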

6. The image recognition apparatus according to claim 5, wherein the model 
attitude estimation means assumes a centroid for the cluster having the largest number 
of members to be an affine transformation parameter to determine a position and an 
attitude of the model. 

7. The image recognition apparatus according to claim 5, wherein the model 
attitude estimation means assumes a candidate-associated feature point pair giving the 
affine transformation parameter belonging to a cluster having the largest number of 
members to be a true candidate-associated feature point pair and uses the true 
candidate-associated feature point pair for least squares estimation to find an affine 
transformation parameter for determining a position and an attitude of the model. 
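
The least squares estimation of claim 7 can be sketched as below, assuming the "true" pairs have already been selected and the same affine model u = a1·x + a2·y + b1, v = a3·x + a4·y + b2 is used. Solving the normal equations with Cramer's rule is an illustrative choice, and the function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_affine_least_squares(pairs):
    """Least squares affine fit over ((x, y), (u, v)) pairs.

    Returns (a1, a2, b1, a3, a4, b2). Raises ValueError when the source
    points are collinear or fewer than three distinct points are given.
    """
    normal = [[0.0] * 3 for _ in range(3)]  # accumulates M^T M
    rhs_u = [0.0] * 3                       # accumulates M^T u
    rhs_v = [0.0] * 3                       # accumulates M^T v
    for (x, y), (u, v) in pairs:
        row = (x, y, 1.0)
        for i in range(3):
            rhs_u[i] += row[i] * u
            rhs_v[i] += row[i] * v
            for j in range(3):
                normal[i][j] += row[i] * row[j]
    d = det3(normal)
    if abs(d) < 1e-12:
        raise ValueError("degenerate point configuration")

    def solve(rhs):
        solution = []
        for col in range(3):
            m = [r[:] for r in normal]
            for i in range(3):
                m[i][col] = rhs[i]
            solution.append(det3(m) / d)  # Cramer's rule
        return solution

    return tuple(solve(rhs_u) + solve(rhs_v))
```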

8. The image recognition apparatus according to claim 1, further comprising:
candidate-associated feature point pair selection means for creating a rotation
angle histogram concerning a rotation angle equivalent to a shift amount giving the
shortest distance and selecting a candidate-associated feature point pair giving a rotation
angle for a peak in the rotation angle histogram from the candidate-associated feature 
point pair generated by the feature quantity comparison means, 

wherein the model attitude estimation means detects the presence or absence 
of the model on the object image using a candidate-associated feature point pair 
selected by the candidate-associated feature point pair selection means and estimates a 
position and an attitude of the model, if any. 
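
The selection of claim 8 can be sketched as a simple vote over shift amounts. This assumes each candidate pair is stored together with the integer shift that gave its shortest histogram distance; that data layout and the name `select_by_rotation_peak` are illustrative assumptions.

```python
from collections import Counter


def select_by_rotation_peak(pairs_with_shift):
    """Keep the pairs whose shift amount equals the histogram peak.

    pairs_with_shift is a list of (pair, shift) tuples, where shift is the
    bin shift that minimized the histogram distance for that pair.
    """
    if not pairs_with_shift:
        return []
    histogram = Counter(shift for _, shift in pairs_with_shift)
    peak_shift, _ = histogram.most_common(1)[0]
    return [pair for pair, shift in pairs_with_shift if shift == peak_shift]
```

Pairs that belong to the same rigid model should agree on the rotation, so pairs away from the peak are treated as likely mismatches before attitude estimation.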

9. The image recognition apparatus according to claim 1, further comprising: 
candidate-associated feature point pair selection means for performing
generalized Hough transform for a candidate-associated feature point pair generated
by the feature quantity comparison means, assuming a rotation angle, enlargement and 
reduction ratios, and horizontal and vertical linear displacements to be a parameter 
space, and selecting a candidate-associated feature point pair having voted for the 
most voted parameter from candidate-associated feature point pairs generated by the 
feature quantity comparison means, 

wherein the model attitude estimation means detects the presence or absence 
of the model on the object image using a candidate-associated feature point pair 
selected by the candidate-associated feature point pair selection means and estimates a 
position and an attitude of the model, if any. 
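
The voting of claim 9 can be sketched as follows. The sketch assumes each feature point carries a position, a scale and an orientation as a tuple `(x, y, scale, theta)`, and that votes are cast into fixed-width bins. The bin sizes, the tuple layout and the name `hough_select` are assumptions, not taken from the claims.

```python
import math
from collections import defaultdict


def hough_select(pairs, angle_bin=math.radians(30), scale_base=2.0, shift_bin=16.0):
    """Return the pairs that voted for the most voted parameter bin.

    Each pair is ((mx, my, m_scale, m_theta), (ox, oy, o_scale, o_theta)),
    model feature first, object feature second (assumed layout).
    """
    votes = defaultdict(list)
    for pair in pairs:
        (mx, my, m_scale, m_theta), (ox, oy, o_scale, o_theta) = pair
        rotation = (o_theta - m_theta) % (2.0 * math.pi)
        scale = o_scale / m_scale
        c, s = math.cos(rotation), math.sin(rotation)
        # Displacement left over after rotating and scaling the model point.
        dx = ox - scale * (c * mx - s * my)
        dy = oy - scale * (s * mx + c * my)
        key = (
            int(rotation // angle_bin),
            round(math.log(scale) / math.log(scale_base)),
            int(dx // shift_bin),
            int(dy // shift_bin),
        )
        votes[key].append(pair)
    return max(votes.values(), key=len, default=[])
```

Hard bin edges can split votes for parameters that fall near a boundary; a fuller implementation might also vote into neighbouring bins, which this sketch omits.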

10. The image recognition apparatus according to claim 1, wherein the feature
point extraction means extracts a local maximum point or a local minimum point in
second-order differential filter output images with respective resolutions as the feature
point, i.e., a point free from positional changes due to resolution changes within a 
specified range in a multi-resolution pyramid structure acquired by repeatedly 
applying smoothing filtering and reduction resampling to the object image or the 
model image. 

11. An image recognition apparatus which compares an object image containing a
plurality of objects with a model image containing a model to be detected and extracts 
the model from the object image, the apparatus comprising: 

feature point extracting means for extracting a feature point from each of the 
object image and the model image; 

feature quantity retention means for extracting and retaining a feature quantity 
in a neighboring region at the feature point in each of the object image and the model 
image; 

feature quantity comparison means for comparing each feature point of the 
object image with each feature point of the model image and generating a 
candidate-associated feature point pair having similar feature quantities; and 

model attitude estimation means for detecting the presence or absence of the 
model on the object image using the candidate-associated feature point pair and 
estimating a position and an attitude of the model, if any, 

wherein the model attitude estimation means repeatedly projects an affine 
transformation parameter determined from three randomly selected
candidate-associated feature point pairs onto a parameter space and finds an affine
transformation parameter to determine a position and an attitude of the model based
on an affine transformation parameter belonging to a cluster having the largest number
of members out of clusters formed on a parameter space.

12. The image recognition apparatus according to claim 11, wherein the model
attitude estimation means assumes a centroid for the cluster having the largest number 
of members to be an affine transformation parameter to determine a position and an 
attitude of the model. 

13. The image recognition apparatus according to claim 11, wherein the model
attitude estimation means assumes a candidate-associated feature point pair giving the
affine transformation parameter belonging to a cluster having the largest number of
members to be a true candidate-associated feature point pair and uses the true
candidate-associated feature point pair for least squares estimation to find an affine
transformation parameter for determining a position and an attitude of the model.

14. The image recognition apparatus according to claim 11, further comprising:
candidate-associated feature point pair selection means for performing
generalized Hough transform for a candidate-associated feature point pair generated
by the feature quantity comparison means, assuming a rotation angle, enlargement and
reduction ratios, and horizontal and vertical linear displacements to be a parameter
space, and selecting a candidate-associated feature point pair having voted for the
most voted parameter from candidate-associated feature point pairs generated by the
feature quantity comparison means,

wherein the model attitude estimation means detects the presence or absence 
of the model on the object image using a candidate-associated feature point pair 
selected by the candidate-associated feature point pair selection means and estimates a 
position and an attitude of the model, if any.

15. The image recognition apparatus according to claim 1, wherein the feature 
point extraction means extracts a local maximum point or a local minimum point in 
second-order differential filter output images with respective resolutions as the feature 
point, i.e., a point free from positional changes due to resolution changes within a
specified range in a multi-resolution pyramid structure acquired by repeatedly
applying smoothing filtering and reduction resampling to the object image or the
model image. 

16. An image recognition method which compares an object image containing a 
plurality of objects with a model image containing a model to be detected and extracts
the model from the object image, the method comprising:

a feature point extracting step of extracting a feature point from each of the 
object image and the model image; 

a feature quantity retention step of extracting and retaining, as a feature 
quantity, a density gradient direction histogram at least acquired from density gradient
information in a neighboring region at the feature point in each of the object image
and the model image;

a feature quantity comparison step of comparing each feature point of the
object image with each feature point of the model image and generating a 
candidate-associated feature point pair having similar feature quantities; and 

a model attitude estimation step of detecting the presence or absence of the 
model on the object image using the candidate-associated feature point pair and
estimating a position and an attitude of the model, if any, 

wherein the feature quantity comparison step itinerantly shifts one of the 
density gradient direction histograms of feature points to be compared in density 
gradient direction to find distances between the density gradient direction histograms
and generates the candidate-associated feature point pair by assuming a shortest
distance to be a distance between the density gradient direction histograms.

17. An image recognition method which compares an object image containing a
plurality of objects with a model image containing a model to be detected and extracts 
the model from the object image, the method comprising:

a feature point extracting step of extracting a feature point from each of the
object image and the model image;

a feature quantity retention step of extracting and retaining a feature quantity 
in a neighboring region at the feature point in each of the object image and the model 
image; 

a feature quantity comparison step of comparing each feature point of the
object image with each feature point of the model image and generating a
candidate-associated feature point pair having similar feature quantities; and

a model attitude estimation step of detecting the presence or absence of the 
model on the object image using the candidate-associated feature point pair and 
estimating a position and an attitude of the model, if any,

wherein the model attitude estimation step repeatedly projects an affine 
transformation parameter determined from three randomly selected 
candidate-associated feature point pairs onto a parameter space and finds an affine
transformation parameter to determine a position and an attitude of the model based
on an affine transformation parameter belonging to a cluster having the largest number
of members out of clusters formed on a parameter space. 

18. An autonomous robot apparatus capable of comparing an input image with a
model image containing a model to be detected and extracting the model from the 
input image, the apparatus comprising: 

image input means for imaging an outside environment to generate the input
image;

feature point extracting means for extracting a feature point from each of the 
input image and the model image; 

feature quantity retention means for extracting and retaining, as a feature 
quantity, a density gradient direction histogram at least acquired from density gradient 
information in a neighboring region at the feature point in each of the input image and 
the model image;

feature quantity comparison means for comparing each feature point of the
input image with each feature point of the model image and generating a 
candidate-associated feature point pair having similar feature quantities; and 

model attitude estimation means for detecting the presence or absence of the 
model on the input image using the candidate-associated feature point pair and 
estimating a position and an attitude of the model, if any, 

wherein the feature quantity comparison means itinerantly shifts one of the 
density gradient direction histograms of feature points to be compared in density 
gradient direction to find distances between the density gradient direction histograms 
and generates the candidate-associated feature point pair by assuming a shortest 
distance to be a distance between the density gradient direction histograms.

19. An autonomous robot apparatus capable of comparing an input image with a
model image containing a model to be detected and extracting the model from the 
input image, the apparatus comprising: 

image input means for imaging an outside environment to generate the input
image;

feature point extracting means for extracting a feature point from each of the 
input image and the model image; 

feature quantity retention means for extracting and retaining a feature quantity 
in a neighboring region at the feature point in each of the input image and the model 
image;

feature quantity comparison means for comparing each feature point of the
input image with each feature point of the model image and generating a 
candidate-associated feature point pair having similar feature quantities; and 

model attitude estimation means for detecting the presence or absence of the 
model on the input image using the candidate-associated feature point pair and 
estimating a position and an attitude of the model, if any, 

wherein the model attitude estimation means repeatedly projects an affine
transformation parameter determined from three randomly selected 
candidate-associated feature point pairs onto a parameter space and finds an affine 
transformation parameter to determine a position and an attitude of the model based 
on an affine transformation parameter belonging to a cluster having the largest number 
of members out of clusters formed on a parameter space. 



